RAG2

This Project

This web page originated as an assignment in Emory University's Biology 142 lab course. Students were assigned proteins of interest and asked to research what is known about each protein and to examine whether the newly sequenced whale shark genome shows evidence of an orthologous protein.

Background Information:

RAG2 (recombination activating gene 2) is a protein-coding gene involved in the initiation of V(D)J recombination during B and T cell development (Tabori et al., 2004). V(D)J recombination, or somatic recombination, is a mechanism that occurs only in developing lymphocytes during the early stages of T and B cell development. It generates the diverse repertoire of immunoglobulins (antibodies) in B cells and of T cell receptors in T cells. The process is integral to an adaptive immune system that can react actively and efficiently to a wide range of immune challenges, and it occurs primarily in lymphoid organs. Recombination is initiated when RAG1 and RAG2 bind DNA, bring together two recombination signal sequences (RSSs) located on the same DNA molecule (synapsis), and cleave at both; this ultimately yields assembled antigen receptor genes (Lovely et al., 2015). In addition, Byrum et al. (2015) found that the interaction of RAG2 with RAG1 induces interactions among multiple binding sites that result in the formation of the DNA cleavage active site. Mutations in RAG2 are known to cause Omenn syndrome, a form of severe combined immunodeficiency (Chou et al., 2012). The RAG2 protein is a major component of the RAG1-RAG2 complex, which initiates site-specific recombination by cutting DNA at conserved recombination signal sequences (Kim et al., 2014).

Methods/Approach

The human protein sequence for RAG2 (ENSP00000308620) was obtained from the Ensembl database. This protein sequence was used as the query in a BLAST against the whale shark protein database on the Georgia Aquarium Galaxy server. Predicted protein hits were chosen, and their sequences were obtained through the Galaxy server. These predicted protein sequences were then used as queries in subsequent BLASTs against the human protein database through NCBI to test whether each hit points back to human RAG2.
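
The forward-then-reverse search described above is often called a reciprocal best hit check. The following is a minimal sketch of that logic, not the Galaxy or NCBI workflow itself: the function names, the hit format (subject ID plus e-value), and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# A hit is (subject_id, e_value). This layout is an assumption for the sketch.
Hit = Tuple[str, float]


def best_hit(hits: List[Hit]) -> Optional[str]:
    """Return the subject ID with the lowest e-value, or None if there are no hits."""
    if not hits:
        return None
    return min(hits, key=lambda h: h[1])[0]


def reciprocal_candidates(
    query_id: str,
    forward_hits: List[Hit],
    reverse_hits: Dict[str, List[Hit]],
) -> List[str]:
    """Keep forward hits whose own best hit in the reverse search is the original query."""
    confirmed = []
    for subject_id, _ in forward_hits:
        if best_hit(reverse_hits.get(subject_id, [])) == query_id:
            confirmed.append(subject_id)
    return confirmed


# Hypothetical toy data (not results from this project).
forward = [("shark_A", 1e-50), ("shark_B", 2e-4)]
reverse = {
    "shark_A": [("human_RAG2", 1e-48), ("human_other", 0.5)],
    "shark_B": [("human_other", 1e-10), ("human_RAG2", 3.5)],
}

print(reciprocal_candidates("human_RAG2", forward, reverse))
```

In this toy case only shark_A is retained, because shark_B's best hit in the reverse search is a different human protein.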

Orthologs were searched for by using the human protein sequence for RAG2 (ENSP00000308620) as the query sequence for BLASTs against the protein databases of other species.

A phylogenetic tree was created with CLUSTALW from the top predicted protein hits of the BLASTs against the protein databases of other species, all of which used the human protein sequence as the query.

Searching for RAG2 in the Whale Shark

The human protein sequence for RAG2 was used as the query in a BLAST against the whale shark protein database through the Georgia Aquarium Galaxy server. The five best hits were chosen based on their e-value, query coverage, and percent identity, as listed in Table 1.

Table 1

| Whale Shark ID | E-value | Alignment Length (aa) | Predicted Protein Length (aa) | % Identity |
|---|---|---|---|---|
| g47105.t1 | 2e-04 | 53 | 1095 | 35.85 |
| g43408.t1 | 4e-05 | 39 | 167 | 38.40 |
| g26477.t1 | 3e-04 | 43 | 86 | 37.21 |
| g47950.t1 | 1e-05 | 38 | 208 | 34.21 |
| g38774.t1 | 2e-04 | 39 | 552 | 43.51 |

Table 1: Top predicted protein hits for the whale shark protein database. The human protein sequence for RAG2 was used as a query in a BLAST against the whale shark protein database. E-values, alignment length, and % identity were analyzed in order to select five hits. The specifics for these hits are listed here.
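
To put the hits in Table 1 in perspective, the short sketch below ranks them by e-value and computes what fraction of each predicted whale shark protein the alignment spans. It also compares the longest alignment with the 527-residue length of human RAG2 listed in Table 3. The variable and function names are illustrative, and treating alignment length as a count of aligned residues ignores gaps, so the fractions are approximate.

```python
# Rows copied from Table 1:
# (whale shark ID, e-value, alignment length, predicted protein length, % identity)
table1 = [
    ("g47105.t1", 2e-04, 53, 1095, 35.85),
    ("g43408.t1", 4e-05, 39, 167, 38.40),
    ("g26477.t1", 3e-04, 43, 86, 37.21),
    ("g47950.t1", 1e-05, 38, 208, 34.21),
    ("g38774.t1", 2e-04, 39, 552, 43.51),
]

HUMAN_RAG2_LENGTH = 527  # length of NP_000527 as listed in Table 3


def aligned_fraction(row):
    """Approximate fraction of the predicted protein covered by the alignment."""
    return row[2] / row[3]


# Lowest e-value first; ties keep their Table 1 order.
ranked = sorted(table1, key=lambda row: row[1])
for row in ranked:
    print(f"{row[0]}  e-value={row[1]:.0e}  aligned={aligned_fraction(row):.1%}")

# Longest alignment as an approximate fraction of the human query.
longest_alignment = max(row[2] for row in table1)
query_fraction = longest_alignment / HUMAN_RAG2_LENGTH
print(f"longest alignment covers about {query_fraction:.1%} of human RAG2")
```

Even the longest alignment (53 positions) spans only about a tenth of the human query, which is worth keeping in mind when interpreting these hits.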

These best hits were used as the queries in subsequent BLASTs against the human protein database. The g43408.t1 whale shark protein sequence returned RAG2 as the predicted protein in the human protein database with a query coverage of 23%, an e-value of 3.5, and a percent identity of 38%. Because an e-value of 3.5 is far from significant and the query coverage is low, we were not confident that g43408.t1 is a whale shark ortholog of RAG2. The other best hits, when BLASTed against the human protein database, did not return RAG2 as the top predicted protein.
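
For context on why an e-value of 3.5 carries so little weight: the e-value is the expected number of alignments with an equal or better score that would arise by chance in a search of that size. Under the Poisson assumption used in BLAST statistics, the probability of seeing at least one such chance hit is 1 - exp(-E). A minimal sketch of that arithmetic follows; the function name is ours, chosen for illustration.

```python
import math


def p_at_least_one_chance_hit(e_value: float) -> float:
    """Convert a BLAST e-value to P(at least one chance hit) = 1 - exp(-E)."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


# 3.5 is the reverse-BLAST e-value reported for g43408.t1;
# 4e-05 is its forward e-value from Table 1.
for e_value in (3.5, 4e-05):
    print(f"E = {e_value:g}  ->  P = {p_at_least_one_chance_hit(e_value):.6g}")
```

For small e-values the probability is nearly equal to the e-value itself, while for an e-value of 3.5 a chance hit at least this good is almost certain.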

We repeated this process using the predicted elephant shark RAG2 protein (XP_007885835.1) as the query, since elephant sharks are more closely related to whale sharks than humans are. This protein was returned by a BLAST that used the human RAG2 protein sequence as the query, with a query coverage of 100%, an e-value of 0.0, and a percent identity of 53%. We were therefore fairly confident that the elephant shark genome encodes RAG2. We then used this predicted protein as the query in a BLAST against the whale shark protein database to see whether the search would return different best hits. New top hits in the whale shark protein database were found and are recorded in Table 2.

Table 2

| Whale Shark ID | E-value | Alignment Length (aa) | Predicted Protein Length (aa) | % Identity |
|---|---|---|---|---|
| g25937.t1 | 8e-05 | 36 | 105 | 36.11 |
| g43166.t1 | 6e-05 | 67 | 614 | 31.34 |
| g43984.t1 | 8e-04 | 53 | 259 | 32.08 |
| g41179.t1 | 1e-04 | 36 | 194 | 38.89 |

Table 2: Top predicted protein hits for the whale shark protein database using the elephant shark predicted protein sequence for RAG2 as the query in the BLAST. We were confident that the elephant shark has a RAG2 ortholog. The best hits from this BLAST did not match those from the BLAST using the human RAG2 protein sequence as the query (as shown in Table 1).

These best hits were then used as query sequences in subsequent BLASTs against the elephant shark protein database. None of these BLASTs returned the RAG2 predicted protein in the elephant shark as the best hit. All of the BLASTs resulted in different putative conserved domains; none of the results had matching protein domains. Therefore, we hypothesized that the whale shark most likely does not have the RAG-2 protein.

Protein Domains

None of the protein domains of the cross-BLASTs against the elephant shark and whale shark protein databases matched, but putative conserved domains were found for each of the best hits. The results included actin-related proteins, proteins involved in chromosome segregation (SMC proteins), and proteins with phosphodiesterase and ribonuclease activity. SMC (structural maintenance of chromosomes) proteins deal with chromosome structure and dynamics. Specifically, they play important roles in "chromosome condensation, sister-chromatid cohesion, sex-chromosome dosage compensation, genetic recombination and DNA repair" (Strunnikov, 1998).

Orthologs

The human protein sequence for RAG-2 was used as the query in multiple BLASTs against the protein databases of mice, zebrafish, clawed frogs, yeast, elephant sharks, and fruit flies in order to search for orthologs. The best hits for these BLASTs are shown in Table 3.

Table 3

| Species | Top Predicted Protein | ID | Length (aa) | E-value | Query Coverage | % Identity |
|---|---|---|---|---|---|---|
| Human | V(D)J recombination-activating protein 2 | NP_000527 | 527 | 0.0 | 100% | N/A |
| Mouse | V(D)J recombination-activating protein 2 | NP_033046.1 | 527 | 0.0 | 100% | 88 |
| Zebrafish | V(D)J recombination-activating protein 2 | NP_571460.2 | 530 | 0.0 | 100% | 51 |
| Clawed frog | V(D)J recombination-activating protein 2 | XP_002937337.2 | 520 | 0.0 | 100% | 60 |
| Yeast | Bye1p | NP_012921.3 | 594 | 0.14 | 13% | 29 |
| Elephant shark | V(D)J recombination-activating protein 2 | XP_007885835.1 | 520 | 0.0 | 100% | 53 |
| Fruit fly | CG1812 | NP_608397.1 | 616 | 6.2 | 25% | 23 |

Table 3: Predicted orthologs of the RAG2 protein in mice, zebrafish, clawed frogs, yeast, elephant sharks, and fruit flies using the human RAG2 protein sequence as the query.
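
One way to make the reading of Table 3 explicit is to apply simple cutoffs to e-value and query coverage. The sketch below does this with the Table 3 values; the two thresholds are assumptions chosen for illustration, not criteria stated in the original analysis, and the function name is ours.

```python
# Rows copied from Table 3: (species, e-value, query coverage %, % identity or None)
table3 = [
    ("Human", 0.0, 100, None),
    ("Mouse", 0.0, 100, 88),
    ("Zebrafish", 0.0, 100, 51),
    ("Clawed frog", 0.0, 100, 60),
    ("Yeast", 0.14, 13, 29),
    ("Elephant shark", 0.0, 100, 53),
    ("Fruit fly", 6.2, 25, 23),
]

MAX_E_VALUE = 1e-5       # assumed cutoff, not from the original analysis
MIN_QUERY_COVERAGE = 50  # assumed cutoff, in percent


def looks_like_ortholog(e_value: float, coverage: float) -> bool:
    """Apply both illustrative cutoffs to a single BLAST hit."""
    return e_value <= MAX_E_VALUE and coverage >= MIN_QUERY_COVERAGE


supported = [s for s, e, cov, _ in table3 if looks_like_ortholog(e, cov)]
unsupported = [s for s, e, cov, _ in table3 if not looks_like_ortholog(e, cov)]

print("Supported:", supported)
print("Not supported:", unsupported)
```

With these cutoffs the yeast and fruit fly hits are the only ones that fail, and the human row passes trivially because it is the query matching itself.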

It seems that the mouse, zebrafish, clawed frog, and elephant shark all have the RAG2 protein, which belongs to the RAG2 superfamily as shown in Figure 1. It is interesting to note that the whale shark does not appear to have the RAG2 protein while the elephant shark, a fellow cartilaginous fish, does. Yeast and fruit flies do not appear to have orthologs.

Figure 1: RAG2 superfamily. All species with orthologs of the human RAG2 protein possessed this RAG2 superfamily. Image from NCBI BLAST.

Phylogeny

The best hits from the ortholog search against other species, along with the top five hits from the BLAST against the whale shark protein database, were used to create a phylogenetic tree through CLUSTALW as shown in Figure 2.

Figure 2: Phylogenetic tree constructed using the best hits from BLASTs against mice, zebrafish, clawed frogs, yeast, elephant sharks, and fruit flies along with the five top predicted protein hits from the whale shark protein database. These BLASTs used the human RAG2 protein sequence as the query.
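
Trees of this kind are built from pairwise distances between aligned sequences; ClustalW, for instance, uses the neighbor-joining method on such distances. The sketch below shows only the simplest distance, the fraction of differing residues at positions where neither sequence has a gap. It is an illustration under our own assumptions: the function name and the three short aligned sequences are hypothetical and are not taken from the project's alignment.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of mismatched residues over alignment columns with no gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    mismatches = sum(x != y for x, y in pairs)
    return mismatches / len(pairs)


# Hypothetical aligned sequences ("-" marks a gap); not real RAG2 data.
aligned = {
    "seq_A": "MSLQMVTV-GH",
    "seq_B": "MSLQPLTVNGH",
    "seq_C": "MALR-LSANGK",
}

names = sorted(aligned)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = p_distance(aligned[first], aligned[second])
        print(f"{first} vs {second}: {d:.3f}")
```

Sequences separated by small distances are joined first, so seq_A and seq_B would group together before seq_C in this toy example.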

Two of the top predicted whale shark protein hits share common ancestors with the species that have RAG2 orthologs. This grouping suggests that these particular whale shark sequences may share some similarity with the RAG2 protein sequence, although part of the sequence may have been lost over time. In addition, the g47950.t1 and g47105.t1 whale shark protein sequences group together and share a common ancestor with the species that have RAG2 orthologs.

Conclusions

We were not able to identify an ortholog of the RAG2 protein in the whale shark. We were able to identify several putative conserved domains, most of which deal with chromosome structure and function during segregation. Since whale sharks do not seem to have the RAG2 protein, they may not undergo V(D)J recombination, a characteristic of adaptive immunity. Based on our research with TLR10, we believe that the whale shark shares many characteristics of innate immunity present in humans. However, if whale sharks do not undergo V(D)J recombination, their adaptive immune system may differ greatly from its human counterpart. More research needs to be done on the numerous putative conserved domains found in the best hits from the whale shark protein database using the human and elephant shark RAG2 protein sequences as queries. Other proteins involved in adaptive immunity should also be examined in order to further establish the possible differences between adaptive immunity in whale sharks and humans.

References:

Byrum, J. N., Zhao, S., Rahman, N. S., Gwyn, L. M., Rodgers, W., & Rodgers, K. K. (2015). An interdomain boundary in RAG1 facilitates cooperative binding to RAG2 in formation of the V(D)J recombinase complex. Protein Science. doi:10.1002/pro.2660

Chou, J., Hanna-Wakim, R., Tirosh, I., Kane, J., Fraulino, D., Lee, Y. N., . . . Massaad, M. J. (2012). A novel homozygous mutation in recombination activating gene 2 in 2 relatives with different clinical phenotypes: Omenn syndrome and hyper-IgM syndrome. Journal of Allergy and Clinical Immunology, 130(6), 1414-1416. doi:10.1016/j.jaci.2012.06.012

Kim, M., Lapkouski, M., Yang, W., & Gellert, M. (2014). Crystal structure of the V(D)J recombinase RAG1–RAG2. Nature, 518(7540), 507-511. doi:10.1038/nature14174

Lovely, G. A., Brewster, R. C., Schatz, D. G., Baltimore, D., & Phillips, R. (2015). Single-molecule analysis of RAG-mediated V(D)J DNA cleavage. Proceedings of the National Academy of Sciences, 112(14). doi:10.1073/pnas.1503477112

Recombination Activating Gene 2. (n.d.). Retrieved from NCBI website: http://www.ncbi.nlm.nih.gov/gene/5897

Strunnikov, A. V. (1998). SMC proteins and chromosome structure. Trends in Cell Biology, 8(11), 454-459. Retrieved April 13, 2015.

Tabori, U., Mark, Z., Amariglio, N., Etzioni, A., Golan, H., Biloray, B., . . . Dalal, I. (2004). Detection of RAG mutations and prenatal diagnosis in families presenting with either T-B- severe combined immunodeficiency or Omenn's syndrome. Clinical Genetics, 65(4), 322-326. doi:10.1111/j.1399-0004.2004.00227.x

Websites used for research:

http://www.ensembl.org/index.html
http://www.genome.jp/tools/clustalw/
http://blast.ncbi.nlm.nih.gov/Blast.cgi
http://whaleshark.georgiaaquarium.org/root/index